Storing the State in a Remote Backend

Learn how to create a storage bucket to hold the Terraform state of our cluster.

Creating the AWS S3 bucket#

Terraform maintains internal information about the current state of our resources. That allows it to deduce what needs to be done to converge the actual state into the desired state defined in *.tf files. Currently, that state is stored locally in the terraform.tfstate file. For now, there shouldn’t be anything exciting in it. Let’s take a look at terraform.tfstate.

Definition of terraform.tfstate
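If you don’t have the file in front of you, a freshly initialized terraform.tfstate (state format version 4) looks roughly like the sketch below. The terraform_version and lineage values are illustrative; yours will differ.

```json
{
  "version": 4,
  "terraform_version": "0.12.12",
  "serial": 1,
  "lineage": "00000000-0000-0000-0000-000000000000",
  "outputs": {},
  "resources": []
}
```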

The field that really matters is resources. It’s empty because we didn’t define any. We’ll do that soon, but we’re not going to create anything related to our EKS cluster. At least not right away. What we need right now is a storage bucket.

Keeping Terraform’s state local is a bad idea. If it’s on someone’s laptop, no one else can safely modify the state of our resources. We’d need to send the terraform.tfstate file around by email, keep it on a network drive, or implement some similar workaround. That’s impractical.

We might be tempted to store it in Git, but that’s not secure. Instead, we’ll tell Terraform to keep the state in an AWS S3 bucket. Since we’re trying to define infrastructure as code, we won’t do that by executing a shell command, nor will we go to the AWS console. We’ll tell Terraform to create the bucket. It will be the first resource Terraform manages.

Viewing storage.tf#

We’re about to explore the aws_s3_bucket resource. As the name suggests, it allows us to manage AWS S3 buckets. More information is available in the aws_s3_bucket documentation.

Here’s the definition of our storage.tf file.

Output of storage.tf
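If the file isn’t at hand, the definition looks roughly like the sketch below. The variable names (var.state_bucket and var.region) are assumptions based on the variables used throughout this lesson; check your own variables.tf for the exact names.

```hcl
resource "aws_s3_bucket" "state" {
  bucket = var.state_bucket
  acl    = "private"
  # In older AWS provider versions (2.x) the region could be set here;
  # newer versions take it from the provider block instead.
  region = var.region
}
```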
  • We’re defining the storage bucket referenced as state on line 1. Every resource entry consists of a type (e.g., aws_s3_bucket) and a reference name (e.g., state). We’ll see how references are used in upcoming definitions.
  • Just as with the provider, the resource has several fields. Some are mandatory, while others are optional and often have predefined values.
  • We’re defining the name of the bucket and the acl.
  • Also, we specify that it should be created inside a specific region. The value of one of those fields is defined as a variable (var.region). Others (those less likely to change) are hard-coded.

There’s one tiny problem we need to fix before we proceed. AWS S3 bucket names need to be globally unique. There cannot be two buckets with the same name anywhere within the same partition.

We’ll generate a unique name using the date.

Setting a unique name for the storage bucket
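A minimal way to do that, assuming the Terraform variable is named state_bucket, is to embed a timestamp in the name and export it as a TF_VAR_* environment variable so that Terraform picks it up automatically:

```shell
# S3 bucket names must be globally unique, lowercase, and 3-63 characters long.
# A timestamp suffix makes collisions unlikely (though not impossible).
export TF_VAR_state_bucket=devops-catalog-$(date +%Y%m%d%H%M%S)

echo $TF_VAR_state_bucket
```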

The name of the bucket is now, more or less, unique, so it’s unlikely that someone else has already claimed it. The environment variable we created will be used as the name of the bucket.

AWS S3 storage bucket

Let’s apply the new definition.

The output, limited to the relevant parts, is as follows.

Output of terraform apply

In this case, we can see the list of all the resources that will be created. The "+" sign indicates that something will be created. Under different conditions, we could also observe resources that would be modified ("~") or destroyed ("-").

Right now, Terraform will deduce that the actual state is missing the aws_s3_bucket resource. It also shows us which properties will be used to create that resource. We defined some, while others will be known after we apply that definition.

Finally, we’re asked whether we want to perform these actions. We should type yes, followed by the “Enter” key.

Note: From here on out, we won’t explicitly explain that we need to confirm Terraform actions.

After we choose to proceed, the relevant parts of the output should be as follows.

... 
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

We can see that two resources were added and that nothing was changed or destroyed.

Since this is the first time we created a resource with Terraform, it’s reasonable to be skeptical that everything worked perfectly. So, we’ll confirm that the bucket was indeed created by listing all those available. Over time, we’ll gain confidence in Terraform and won’t have to validate that everything works correctly.

Command to view the created bucket
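Assuming the AWS CLI is installed and configured with the same credentials, something like the following lists the buckets in the account. The devops-catalog prefix in the --query filter is an assumption; use whatever prefix you chose for your bucket name.

```shell
# List all buckets in the account, filtered to those whose name starts
# with our prefix (requires the AWS CLI and valid credentials)
aws s3api list-buckets \
    --query "Buckets[?starts_with(Name, 'devops-catalog')].Name" \
    --output text
```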

In our case, the output is as follows.

Viewing the buckets created

Let’s imagine that someone else executed terraform apply and that we’re not sure what the state of the resources is. In such a situation, we can consult Terraform by asking it to show us the state using terraform show.

The output is as follows.

Output of terraform show

There’s not much to look at. For now, we only have one resource (aws_s3_bucket). As we keep progressing, that output will increase and, more importantly, it will always reflect the state of the resources managed by Terraform.

The previous output is a human-readable format of the state currently stored in terraform.tfstate. We can inspect that file as well using cat terraform.tfstate. The output is as follows.

Output of terraform.tfstate

If we ignore the fields that are currently empty, and the few that are for Terraform’s internal usage, we can see that the state stored in that file contains the same information as what we saw by using terraform show. The only important difference is that one is in Terraform’s internal format (terraform.tfstate), while the other (terraform show) is meant to be readable by humans.

Even though it’s not the case right now, the state could easily contain confidential information. It’s currently stored locally, and we already decided to move it to an AWS S3 bucket. That way, we’ll be able to share it, and it will be stored in a more reliable and more secure location.

Overview of DevOps interacting with storage bucket

Moving the state to the bucket#

To move the state to the bucket, we’ll create an S3 backend. As you can probably guess, we’ve already prepared a file just for that.

The output is as follows.

Viewing backend.tf#

Output of backend.tf
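If the file isn’t in front of you, an S3 backend definition looks roughly like the sketch below. The key and region values here are illustrative, and the bucket name is a placeholder we’ll replace in a moment.

```hcl
terraform {
  backend "s3" {
    # Backend blocks cannot reference variables, so the bucket name
    # must be hard-coded and replaced manually.
    bucket = "devops-catalog"
    key    = "terraform/state" # path of the state object inside the bucket
    region = "us-east-1"       # illustrative; use your bucket's region
  }
}
```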

There’s nothing special in that definition. We’re setting the name of the bucket and the key, which is the path under which the state file will be stored inside the bucket.

The bucket entry in that Terraform definition cannot be set to the value of a variable. It needs to be hard-coded. So, we’ll need to replace devops-catalog with the bucket name we used when we created it.

Replacing the bucket name
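One way to do the replacement, assuming the bucket name is still in the TF_VAR_state_bucket environment variable, is a quick sed. The first line only creates a stand-in file so the snippet also runs outside the project directory; in the project, backend.tf already exists.

```shell
# Create a stand-in backend.tf only if the real one is not present
test -f backend.tf || echo 'bucket = "devops-catalog"' > backend.tf

# Fall back to a freshly generated name if the variable is unset
export TF_VAR_state_bucket=${TF_VAR_state_bucket:-devops-catalog-$(date +%Y%m%d%H%M%S)}

# Replace the hard-coded placeholder with the name of the bucket we created
sed -i "s/devops-catalog/$TF_VAR_state_bucket/g" backend.tf

cat backend.tf
```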

Let’s apply the definitions and see what we get.

terraform apply

The output, limited to the relevant parts, is as follows.

Output of terraform apply

Since we’re changing the location where Terraform should store the state, we have to initialize the project again. The last time we did that, it was because a plugin (aws) was missing. This time it’s because the init process will copy the state from the local file to the newly created bucket using the following command.

terraform init

The output, limited to the relevant parts, is as follows.

Output of terraform init

Confirm copying the state by typing yes and pressing the “Enter” key. The process continues by copying the state to the remote storage which, from now on, will be used instead of the local file. Now we should be able to apply the definitions.

terraform apply 

The output is as follows.

Output after terraform apply

As we can see, there was no need to apply the definitions. The latest addition does not define any new resources. We only added the location for the Terraform state. That change is internal, and it was applied through the init process.

Try it yourself#

You can try all of the commands used in this lesson in the code playground below. Press the “Run” button and wait a few seconds for it to connect.

All of the commands above are combined in main.sh for ease of use. If you already have a unique bucket name, please enter it in the environment variable TF_VAR_state_bucket of the code playground below. Otherwise, first, use the commands described earlier in this lesson to create a unique bucket name.

Code playground

Troubleshooting#

If you receive the Bucket already exists error while executing any of the commands in this or the next lessons, run the following command, followed by terraform apply.
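The error means someone else already claimed the name, so the fix is to generate a fresh, unique name, using the same approach as earlier in this lesson, and export it again before re-running terraform apply:

```shell
# Generate a new, unique bucket name (timestamp makes a repeat collision unlikely)
export TF_VAR_state_bucket=devops-catalog-$(date +%Y%m%d%H%M%S)

echo $TF_VAR_state_bucket
```

Remember to use the same name when updating backend.tf.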
